W1 and W2 Theories, and Their Variants:
Thermochemistry in the kJ/mol 
Accuracy Range 



Jan M.L. Martin and S. Parthiban 

Department of Organic Chemistry, 
Weizmann Institute of Science, 
IL-76100 Rehovot, Israel 
E-mail: comartin@wicc.weizmann.ac.il



Chapter 2, pp. 31-65, In: 
Quantum Mechanical Prediction of Thermochemical Data, 
ed. J. Cioslowski and A. Szarecka; 
Understanding Chemical Reactivity Series, Vol. 22; 
Kluwer Academic Publishers, Dordrecht, The Netherlands, 2001 
ISBN 0-7923-7077-5; 



Chapter 2 



W1 and W2 Theories, and Their Variants:
Thermochemistry in the kJ/mol Accuracy Range 



Jan M.L. Martin and S. Parthiban 

Department of Organic Chemistry, Weizmann Institute of Science, Kimmelman 
Building, IL-76100 Rehovot, Israel 



1. INTRODUCTION AND BACKGROUND 

The last fifteen years witnessed the development of a number of
"black-box" computational thermochemistry methods. Among them, 
the G1/G2/G3 theories and their variants, and the CBS-Q family of 
methods by Petersson and coworkers are worth mentioning in particu- 
lar. In addition to these wavefunction-based approaches, density func- 
tional methods - aside from their great popularity as a general tool 
for practical computational chemistry - have gained some currency for 
computational thermochemistry in the medium accuracy range, as have 
group equivalent-based models. For very large systems, semiempirical 
methods remain popular. 

At the other extreme in terms of system size and accuracy stand 
brute-force approaches such as those based on wavefunctions with ex- 
plicit interelectronic distances. 

Methods such as G3 and CBS-QB3 do reach the goal of "chemical 
accuracy" (generally defined as ±1 kcal/mol) on average, but worst-case 
errors for problematic molecules may exceed this criterion by almost an 
order of magnitude. In addition, almost all of these approaches involve 
some level of parameterization and/or empirical correction against ex- 
perimental data. While this is by and large possible (albeit not without 
pitfalls) in the kcal/mol accuracy range for first- and second-row
compounds, experimental data of sub-kcal/mol accuracy are thin on the
ground, and the available data for transition metal compounds are
simply too scarce for this to be a useful approach.

There would thus appear to be room for a more or less "black box" 
computational thermochemistry method that has the following proper- 
ties: 

1. it on average achieves "benchmark accuracy", which we shall ar- 
bitrarily define as one unit of the most common tabulation unit in 
thermochemical reference tables, i.e. 1 kJ/mol (0.24 kcal/mol); 

2. the worst-case error should not exceed 1 kcal/mol ("chemical ac- 
curacy") except perhaps in intrinsically pathological cases; 

3. it is still efficient enough for applications to systems with up to six 
heavy atoms on modern workstations; 

4. it is entirely devoid of parameters derived from experiment (and 
hence from bias towards the systems used for parameterization). 

These have been the design goals in our development of the W1 and W2
(Weizmann-1 and Weizmann-2) theories [1].

The usual design philosophy for this type of methods is bottom-up: 
one starts with an approximate model, compares results with experi- 
ments, analyzes the deviations, and uses them to determine empirical 
corrections and/or additional terms to be added to the model, after 
which the cycle is repeated if desired. 

Our philosophy was instead "top-down". We decomposed the molecular
TAE (total atomization energy: $\mathrm{TAE}_e$ at the bottom of the well,
$\mathrm{TAE}_0$ at absolute zero) into all components that can reasonably affect it
at the kJ/mol level. Then we carried out exhaustive benchmark calcu- 
lations on each component separately for a representative "training set" 
of molecules. Finally, for each component separately, we progressively 
introduced approximations up to the point where reproduction of that 
particular component started deteriorating to an unacceptable extent. 
Thus, experimental data entered the picture only at the validation stage, 
not at the design stage. 

Another philosophical issue centers on whether a method should be 
a "protocol" specified down to the last detail (i.e. be truly "black-box"), 
or whether it should merely outline a general approach with minor de- 
tails to be decided on a case-by-case basis. Obviously a method where 
empirical parameterization is kept to the absolute minimum or is ab- 
sent altogether will offer more 'degrees of freedom' in this regard than 
the one where a minor change in the protocol would, for consistency, 
require reparameterization against a large experimental data set. Yet 
our general guideline was that, while such choices should be possible for
an experienced computational chemist, they should not be an essential 
part of the process itself. 

2. STEPS IN THE W1 AND W2 THEORIES, AND THEIR JUSTIFICATION

The more cost-effective W1 theory and the more rigorous W2 theory
have a lot of points in common. Aside from issues relating to the
reference geometry and the zero-point energy, the main difference con- 
cerns the basis sets used in the extrapolation steps for the SCF and the 
valence correlation contribution. 

These basis sets belong to the "correlation consistent" family of 
Dunning and coworkers [2, 3]. The correlation consistent (cc) basis sets, 
besides being arguably the most compact ones in their accuracy range [4],
have the important property that, by design, they treat radial and an- 
gular correlation in a balanced way. In addition to the regular cc-pVnZ 
(correlation consistent polarized valence n-tuple zeta, or VnZ for short) 
basis sets, several variants have been published. In particular we note 
the aug-cc-pVnZ or AVnZ basis sets [5] for anions (with the combina- 
tion of regular cc-pVnZ on hydrogen and aug-cc-pVnZ on other elements 
generally being denoted aug'-cc-pVnZ [6], or A'VnZ for short), the MT
(Martin-Taylor [7, 8]) and cc-pCVnZ [9] basis sets for inner-shell
correlation, and the cc-pVnZ+1 [10], cc-pVnZ+2d1f [11], and (most recently)
cc-pV(n+d)Z [12] basis sets for second-row atoms exhibiting 'inner
polarization' [11] (vide infra).

We consider here the following sequence of correlation consistent 
basis sets: A'VDZ+2d, A'VTZ+2d1f, A'VQZ+2d1f, and A'V5Z+2d1f,
which we shall denote "small", "medium", "large", and "extra large"
(for first- and second-row compounds, these basis sets are of spd, spdf,
spdfg, and spdfgh quality, respectively). W1 theory, then, carries out all
extrapolations using "small", "medium", and "large", while W2 theory
employs "medium", "large", and "extra large" basis sets.

The W1 and W2 protocols for obtaining the total atomization energy
(TAE) of a given molecule involve the following steps:

1. Geometry optimization at the B3LYP/VTZ+1 level for W1, and
at the CCSD(T)/VQZ+1 level for W2.

2. Extrapolation of the SCF component of TAE from the "small",
"medium", and "large" basis sets (W1) or "medium", "large", and
"extra large" basis sets (W2), by means of either the geometric
extrapolation formula $E(n) = E_\infty + A/B^n$ (old-style) or the
two-point formula $E(n) = E_\infty + A/n^5$ (new-style).

3. Extrapolation of the CCSD valence correlation component of TAE 
from the "medium" and "large" basis sets (W1) or from the "large"
and "extra large" basis sets (W2) employing the two-point formula
$E(n) = E_\infty + A/n^\alpha$, where $\alpha = 3.22$ (W1) or 3 exactly (W2).

4. Extrapolation of the contribution to TAE of the connected triple 
excitations, (T), from the valence orbitals using the same formulae 
as for CCSD, but employing instead the "small" and "medium" 
basis sets (W1) or the "medium" and "large" basis sets (W2).

5. The contribution of inner-shell correlation is taken as the difference 
between the CCSD(T)/MTsmall TAE with and without constrain- 
ing the inner-shell orbitals to be doubly occupied. 

6. The scalar relativistic contribution is computed as the first-order 
Darwin and mass-velocity corrections from the ACPF/MTsmall
wave function, including inner-shell correlation. 

7. The contribution to TAE of spin-orbit splitting in the constituent
atoms is trivially obtained from a tabulation, while for molecules
in degenerate ground states, CISD/MTsmall spin-orbit splittings 
are computed (allowing correlation from the 2s and 2p orbitals in 
second-row atoms). 

8. The zero-point vibrational energy (Ezpv) is obtained from harmonic
B3LYP/VTZ+1 frequencies scaled by 0.985 in the case of
W1 theory. For W2 theory, anharmonic values of Ezpv from quartic
force fields at the CCSD(T)/VQZ+1 (or comparable) level are
preferred; where this is not feasible, the same procedure as for W1
theory is followed as a "fallback solution".

We shall now proceed to explain in detail these steps and the rationale 
behind them. 
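
The eight steps above can be summarized as a small lookup table. The following Python sketch is only one possible way of writing the protocol down; the dictionary layout, the key names, and the helper `basis_sets_for` are illustrative choices and not part of W1 or W2 theory themselves.

```python
# Basis-set labels as defined in the text.
BASIS_SETS = {
    "small": "A'VDZ+2d",
    "medium": "A'VTZ+2d1f",
    "large": "A'VQZ+2d1f",
    "extra large": "A'V5Z+2d1f",
}

# Steps that differ between W1 and W2 (key names are illustrative).
# "scf" lists the three sets needed by the old-style geometric formula;
# the new-style two-point formula uses only the last two of them.
PROTOCOLS = {
    "W1": {
        "geometry": "B3LYP/VTZ+1",
        "scf": ("small", "medium", "large"),
        "ccsd": ("medium", "large"),
        "(t)": ("small", "medium"),
        "alpha": 3.22,
        "zpve": "B3LYP/VTZ+1 harmonic frequencies scaled by 0.985",
    },
    "W2": {
        "geometry": "CCSD(T)/VQZ+1",
        "scf": ("medium", "large", "extra large"),
        "ccsd": ("large", "extra large"),
        "(t)": ("medium", "large"),
        "alpha": 3.0,
        "zpve": "anharmonic, CCSD(T)/VQZ+1 quartic force field",
    },
}

# Steps shared by both protocols.
SHARED_STEPS = {
    "inner-shell correlation": "CCSD(T)/MTsmall",
    "scalar relativistic": "ACPF/MTsmall",
    "spin-orbit (degenerate molecular states)": "CISD/MTsmall",
}


def basis_sets_for(protocol, component):
    """Return the basis-set names used to extrapolate one TAE component."""
    return [BASIS_SETS[label] for label in PROTOCOLS[protocol][component]]


for name in ("W1", "W2"):
    for component in ("scf", "ccsd", "(t)"):
        print(name, component, basis_sets_for(name, component))
```

Written this way, it is apparent that the (T) step of W2 reuses exactly the basis sets of the CCSD step of W1.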

2.1. Reference Geometry 

Near the equilibrium geometry, dependence of the energy on ge- 
ometric displacements is approximately quadratic. As a result, small 
errors in the reference geometry will insignificantly affect computed
energies, but more substantial errors (say, several hundredths of an Å in
covalent bond lengths) will compromise the reliability of a
thermochemical calculation.
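
A rough harmonic estimate illustrates the orders of magnitude involved. The Python sketch below assumes a single bond stretch with a force constant of 5 mdyn/Å; this value is chosen here purely for illustration and is not taken from any of the molecules discussed in this chapter.

```python
# 1 mdyn/Angstrom equals 1 aJ/Angstrom^2, and 1 aJ per molecule
# corresponds to about 143.93 kcal/mol.
KCAL_PER_MOL_PER_AJ = 143.93


def harmonic_energy_error(force_constant_mdyn_per_ang, displacement_ang):
    """Energy rise (kcal/mol) of a harmonic bond displaced from equilibrium."""
    k = force_constant_mdyn_per_ang * KCAL_PER_MOL_PER_AJ
    return 0.5 * k * displacement_ang ** 2


ASSUMED_FORCE_CONSTANT = 5.0  # mdyn/Angstrom, hypothetical illustrative value

for displacement in (0.003, 0.03):
    error = harmonic_energy_error(ASSUMED_FORCE_CONSTANT, displacement)
    print(f"dr = {displacement:.3f} Angstrom -> {error:.4f} kcal/mol")
```

Under this assumption, an error of a few thousandths of an Å costs only thousandths of a kcal/mol per bond, whereas an error of a few hundredths of an Å costs tenths of a kcal/mol per bond, which is no longer negligible at the kJ/mol level.
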

For W1 theory, we chose B3LYP [13, 14] density functional theory
with the VTZ+1 basis set as the level of theory for the reference
geometry, where the +1 suffix denotes the addition to second-row atoms of
the highest-exponent d function from the V5Z basis set [10]. For
first-row molecules, B3LYP/VTZ bond lengths are generally within 0.003 Å
from experiment [15]; for second-row molecules, significant errors can be
seen [10, 16] unless a tight d function is added to the basis set to account
for inner polarization (see below).

For W2 theory, we opted for CCSD(T)/VQZ+1 as the level of theory
for reference geometries. For geometries, the VQZ basis set is known
to be close to the one-particle basis set limit [17, 18], while the addition 
of the inner polarization functions again takes care of inner polarization 
effects. 

2.2. The SCF Component of TAE

For systems devoid of nondynamical correlation effects, this is the 
largest individual contribution to the molecular binding energy. Its ba- 
sis set convergence is relatively rapid, yet our discussion will be dispro- 
portionately long because a number of the "dramatis personae" that 
reappear in the remainder of the story need to be introduced here. 

For the SCF energy, we can - at least for small systems - obtain 
an exact answer by means of numerical SCF calculations. There is sub- 
stantial empirical evidence that its convergence behavior is exponential. 
Jensen studied the convergence behavior of the SCF energy in H2
[19] and N2 [20] and found clear evidence of geometric convergence
behavior in terms of both the maximum angular momentum
in the basis set and the number of primitives within a given angular 
momentum. 

Martin and Taylor [21] compared numerical SCF energies with ex- 
trapolations from calculated SCF/A'VQZ, SCF/A'V5Z, and SCF/A'V6Z 
energies using the formula 

\[ E(L) = E_\infty + A/B^L \tag{2.1} \]

(which is equivalent to $E(L) = E_\infty + A\exp(-BL)$ originally proposed by
Feller [22]) and, for a number of molecules, found discrepancies
of 10 $\mu E_h$ or less between the numerical and extrapolated values.

Petersson et al. had earlier proposed [23] an alternative expression
$E(n) = E_\infty + \sum_{l=n+1}^{\infty} A/(l + 1/2)^6$ in the context of the CBS methods
developed in his group. The summation is carried out numerically in
that paper, but in fact an elegant analytical approximation exists for
summations of this type:

\[ \sum_{l=L+1}^{\infty} \frac{A}{(l + 1/2)^n} = \frac{(-1)^n A}{(n-1)!}\, \psi^{(n-1)}(L + 3/2), \tag{2.2} \]

where $\psi^{(n)}(x)$ represents the order $n$ polygamma function [24] of $x$. Its
asymptotic expansion has the leading terms

\[ \psi^{(n)}(x) = (-1)^{n+1} \left[ \frac{(n-1)!}{x^n} + \frac{n!}{2 x^{n+1}} + O(x^{-n-2}) \right]. \tag{2.3} \]

Hence

\[ \sum_{l=L+1}^{\infty} \frac{A}{(l + 1/2)^n} = \frac{A}{(n-1)(L+1)^{n-1}} + O(L^{-n-1}). \tag{2.4} \]



This suggests the simple extrapolation formula $E(n) = E_\infty + A/n^5$, i.e.
$E_\infty = E(n) + [E(n) - E(n-1)] / [(n/(n-1))^5 - 1]$, where $n$ is identified
with the "n-tuple zetaness" of the Dunning correlation consistent VnZ
basis sets. (For hydrogen and helium, n equals the maximum angular
momentum plus one; for the main group elements it is equal to the
maximum angular momentum.) While an argument in favor of the
Petersson-type formula can be built on the convergence behavior of
triplet-coupled pairs, neither this formula nor the geometric one has a
solid formal basis.
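
The quality of the closed-form approximation for the tail sum, $\sum_{l=L+1}^{\infty} (l + 1/2)^{-n} \approx 1/[(n-1)(L+1)^{n-1}]$, can be checked numerically. The following Python sketch compares it with a brute-force summation for $n = 6$; the function names and the truncation length are choices made here for illustration.

```python
def tail_sum(L, n, terms=20000):
    """Brute-force sum of (l + 1/2)**(-n) for l = L+1, L+2, ...

    The series is truncated after `terms` terms; for n = 6 the neglected
    remainder is far below double-precision resolution of the result.
    """
    return sum((l + 0.5) ** (-n) for l in range(L + 1, L + 1 + terms))


def tail_approx(L, n):
    """Leading term of the asymptotic approximation to the same sum."""
    return 1.0 / ((n - 1) * (L + 1) ** (n - 1))


def relative_error(L, n):
    exact = tail_sum(L, n)
    return (tail_approx(L, n) - exact) / exact


for L in (2, 3, 4, 5, 6):
    print(L, tail_sum(L, 6), tail_approx(L, 6), relative_error(L, 6))
```

The relative deviation is below ten percent already at $L = 3$ and shrinks as $L$ grows, in line with the $O(L^{-n-1})$ remainder.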

Fortunately, convergence on the SCF component of atomization en- 
ergies is even more rapid than for the total energies; Martin and Taylor 
found for 14 first-row molecules [25] that differences between
unextrapolated SCF/A'V5Z, geometrical extrapolations from SCF/A'V{T,Q,5}Z,
and $A + B/L^5$ extrapolations from SCF/A'V{Q,5}Z results are on the
order of 0.01 kcal/mol. For the method that we designated W2, which
uses this basis set sequence, the choice of SCF extrapolation method is
largely a non-issue. For the method that we designated W1, however,
the geometric formula entails the use of results from the comparatively
small A'VDZ basis set, which compromises the reliability of extrapolated
SCF limits in systems with slow basis set convergence. In some cases
(see Table 1 in Ref. 26), this can lead to errors of several kcal/mol.
In addition, the two-point $A + B/L^5$ formula has the elegant property
that it becomes immaterial whether the extrapolation is carried out on
a reaction energy or on the individual absolute energies.

In the original W1/W2 paper [1], we opted for the geometric for- 
mula in view of the observed geometric convergence behavior. In a 
subsequent validation study [26] on a much wider variety of systems, we 
however found the two-point formula to be much more reliable, and we 
have adopted it henceforth. 
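
Both SCF extrapolations have simple closed forms. The Python sketch below is a minimal illustration: the function names are ours, and the energies used to exercise them are synthetic numbers generated from the model formulas themselves, not computed SCF energies.

```python
def extrapolate_geometric(e1, e2, e3):
    """Three-point limit of E(n) = E_inf + A/B**n for three consecutive n."""
    d1 = e2 - e1
    d2 = e3 - e2
    return e3 - d2 * d2 / (d2 - d1)


def extrapolate_power(e_prev, e_last, n_last, exponent=5.0):
    """Two-point limit of E(n) = E_inf + A/n**exponent from n_last - 1 and n_last."""
    ratio = (n_last / (n_last - 1.0)) ** exponent
    return e_last + (e_last - e_prev) / (ratio - 1.0)


# Synthetic data that obey the model formulas exactly (hypothetical values).
geometric_series = [-100.0 + 0.5 / 4.0 ** n for n in (3, 4, 5)]
power_series = [-100.0 + 2.0 / n ** 5 for n in (4, 5)]

print(extrapolate_geometric(*geometric_series))   # recovers the limit -100
print(extrapolate_power(*power_series, n_last=5))  # recovers the limit -100
```

Because `extrapolate_power` is linear in the two input energies, applying it to an energy difference gives the same result as taking the difference of two extrapolated energies; the geometric formula does not share this property.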

Finally, an issue that arises with second-row systems should be ad- 
dressed. It was first noted by Bauschlicher and Partridge [27] that the 
atomization energy of SO2 is exceedingly sensitive to the presence of 
high-exponent d and f functions in the basis set. This phenomenon was
ascribed to hypervalence; Martin and Uzan [10], however, found that the
same phenomenon exists in systems that cannot be considered hypervalent
by the wildest stretch of the imagination, like AlF. In addition, it
was found [11, 16] that properties other than the energy are affected as
well, with (e.g. in SO2 [11] and SO3 [16]) errors of up to 50 cm$^{-1}$ in
harmonic frequencies and hundredths of an Å in bond lengths unless
high-exponent d and f functions (termed "inner polarization functions" in
Ref. 11) are added to the basis set.

We should note that inner polarization is strictly an SCF-level ef- 
fect: while, for instance, switching from an A'VDZ to an A'VDZ+2d 
basis set affects the computed atomization energy of SO3 by as much as 
40 kcal/mol (!), almost all of this effect is seen in the SCF component of 
the TAE [28]. In fact, we have recently found [29] that the effect persists 
if the (1s,2s,2p) orbitals on the second-row atom are all replaced by a
pseudopotential. What is really getting "polarized" here is the inner
part of the valence orbitals, which requires polarization functions that
are much "tighter" (higher-exponent) than those required for the outer
part of the valence orbital. The fact that these inner polarization
functions are in the same exponent range as the d and f functions required
for correlation out of the (2s, 2p) orbitals is merely coincidental; the
"inner polarization" effect has nothing to do with correlation, let alone with
inner-shell correlation.

After extensive numerical experimentation, we have decided [1] on 
the sequence of basis sets noted above: "small" A'VDZ+2d, "medium"
A'VTZ+2d1f, "large" A'VQZ+2d1f, and "extra large" A'V5Z+2d1f.

As the present review was being finalized for publication, we
received a preprint by Dunning et al. [12] where new cc-pV(n+d)Z basis
sets are proposed for the second-row atoms. These basis sets have just
an added tight d function (hence the acronym) and no tight f functions,
but the remaining d functions in the underlying cc-pVnZ basis set are in
addition reoptimized. We are currently investigating their performance
in W1 and W2-type schemes.

2.3. The CCSD Valence Correlation Component of TAE 

The valence correlation component of TAE is the only one that 
can rival the SCF component in importance. As is well known by now 
(and is a logical consequence of the structure of the exact nonrelativistic 
Born-Oppenheimer Hamiltonian on one hand, and the use of a Hartree- 
Fock reference wavefunction on the other hand), molecular correlation 
energies tend to be dominated by double excitations and disconnected 
products thereof. Single excitation energies become important only in 
systems with appreciable nondynamical correlation. Nonetheless, since 
the number of single-excitation amplitudes is so small compared to the 
double-excitation amplitudes, there is no point in treating them sepa- 
rately. 

For all intents and purposes then, we are concerned here with the 
CCSD (coupled cluster with all single and double substitutions [30]) 
correlation energy. Its convergence is excruciatingly slow: Schwartz [31] 
showed as early as 1963 that the increments of successive angular
momenta $l$ to the second-order correlation energy of helium-like atoms
converge as

\[ \Delta E(l) = A/(l + 1/2)^4 + B/(l + 1/2)^6 + \ldots \tag{2.5} \]

His conclusions were generalized to other methods and general pair cor- 
relation energies by Hill [32] and by Kutzelnigg and Morgan [33]. 

This clearly paints a rather bleak picture of basis set convergence.
Indeed, Martin [17] showed in 1994 that while convergence of σ bond
energies appeared in sight at the CCSD(T)/spdfg level, this did not
yet appear to be the case for π bond energies. This earlier study was
extended in 1996 [34] to basis sets of spdfgh quality: somewhat
depressingly, residual errors in the binding energies as high as 2 kcal/mol
were still found for small systems.

However, rather than "knuckling under" to Eq.(2.5) at this stage, 
we might instead exploit it for an extrapolation formula. Martin [34] 
suggested a three-point extrapolation of the form $A + B/(n + 1/2)^\alpha$
(where n is identified with the cardinal number of the cc-pVnZ basis 
set), and obtained dramatically improved computed total atomization 
energies. A slight further improvement was achieved if the SCF and 
valence correlation energies - which have fundamentally different con- 
vergence behaviors - are extrapolated separately using the respective 
appropriate formulae [25]. 

The denominator shift of 1/2 was chosen as a compromise between
the situation for hydrogen and helium (where $n = l + 1$ for the cc-pVnZ
basis set) and main-group elements (where $n = l$). As is immediately
obvious upon series expansion, there is considerable coupling between
the denominator shift and the exponent. As a result, the three-point
extrapolation generally leads to exponents well in excess of three [34].

Halkier et al. [35] found that the simple expression $E(L) = E_\infty + A/L^3$
[i.e. $E_\infty = E(L) + [E(L) - E(L-1)] / [(L/(L-1))^3 - 1]$] works at least
equally well. In view of its simplicity and the fact that no results with
the questionable A'VDZ basis set are required, we have adopted this simple
formula for extrapolation of the CCSD valence correlation energy in W1
and W2 theories.

For the smaller basis sets used in W1 theory, the regime where the
leading $E_\infty + A/L^3$ term dominates convergence behavior has not yet
been reached, and using the formula in its unmodified form leads to
overestimated (in absolute value) CCSD limits. One inelegant solution
would be the use of three-term extrapolations that supplement the leading
$A/L^3$ term with a higher-order term in $1/L$, but in light of the poor
quality of the VDZ basis set this is a most unsatisfactory alternative.
Another alternative is the use of a two-point extrapolation
$E_\infty + A/L^\alpha$, in which $\alpha$ is a fixed empirical parameter.
By minimizing the deviation from the W2 CCSD limit for the so-called
W2-1 set of 28 molecules (vide infra), we determined $\alpha = 3.22$, which is
the value used in W1 theory and its variants.
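
The effect of the exponent on a two-point extrapolation is easy to quantify. In the Python sketch below (function names are ours; the correlation energies used in the check are synthetic), `gain` is the factor by which the last basis-set increment is multiplied to estimate the remaining distance to the limit.

```python
def gain(l_large, alpha):
    """Multiplier of [E(L) - E(L-1)] in the two-point limit E_inf + A/L**alpha."""
    return 1.0 / ((l_large / (l_large - 1.0)) ** alpha - 1.0)


def two_point_limit(e_small, e_large, l_large, alpha):
    """Basis-set limit from energies at cardinal numbers l_large - 1 and l_large."""
    return e_large + gain(l_large, alpha) * (e_large - e_small)


# Cardinal number of the larger basis set and exponent for the CCSD step.
CCSD_SETTINGS = {"W1": (4, 3.22), "W2": (5, 3.0)}

for name, (l_large, alpha) in CCSD_SETTINGS.items():
    print(name, l_large, alpha, gain(l_large, alpha))
print("unmodified exponent at L = 4:", gain(4, 3.0))
```

For the "medium"/"large" pair, $\alpha = 3.22$ gives a smaller multiplier than $\alpha = 3$, so the extrapolated limit lies closer to the largest-basis result, which counteracts the overestimate described above.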

2.4. Connected Triple Excitations: the (T) Valence 
Correlation Component of TAE 

It has been well known for some time (e.g. [36]) that the next com- 
ponent in importance is that of connected triple excitations. By far 
the most cost-effective way of estimating them has been the quasiper- 
turbative approach known as CCSD(T) introduced by Raghavachari et 
al. [37], in which the fourth-order and fifth-order perturbation theory 
expressions for the most important terms are used with the converged 
CCSD amplitudes for the first-order wavefunction. This accounts for
substantial fractions of the higher-order contributions; a very recent
detailed analysis by Cremer and He [38] suggests that 87, 80, and 72 %,
respectively, of the sixth-, seventh-, and eighth-order terms appearing in
the much more expensive CCSDT-1a method are included implicitly in
CCSD(T).

Nevertheless, the formidable $n^3 N^4$ (with n the number of electrons
and N the number of basis functions) cost scaling of the CCSD(T)
method creates a substantial barrier to applications of methods that
require A'V5Z+2d1f basis sets. However, two things should be kept in
mind. First of all, the (T) component of TAE is a small fraction of 
the CCSD component, and hence a larger relative error can be toler- 
ated. Secondly, evidence exists [39] that basis set convergence of the (T) 
contribution is substantially more rapid than that of the CCSD energy. 

As a result, one may justifiably extrapolate the (T) contribution 
from smaller basis sets than its CCSD counterpart: in W1 theory, we
extrapolate from the "small" and "medium" basis sets, and in W2 theory
from the "medium" and "large" basis sets. This means that the most
extensive basis sets in the calculations, namely "large" in W1 theory
and "extra large" in W2 theory only require CCSD calculations, which 
are both much less expensive than CCSD(T) and much more amenable 
to direct algorithms such as those described in Refs. 40-41. 
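
To give a feel for the $N^4$ factor, the Python sketch below counts spherical-harmonic basis functions for a first-row atom in the aug-cc-pVQZ and aug-cc-pV5Z basis sets and forms the fourth power of their ratio. It is a rough illustration only: it considers a single first-row atom, ignores the additional tight functions on second-row atoms and the unaugmented basis on hydrogen, and keeps the number of electrons fixed.

```python
def n_basis_functions(shells):
    """Number of spherical-harmonic functions; shells[l] = contracted shells of angular momentum l."""
    return sum(count * (2 * l + 1) for l, count in enumerate(shells))


# Contracted shells per angular momentum (s, p, d, f, g, h) for a first-row atom.
AVQZ_FIRST_ROW = [6, 5, 4, 3, 2]      # aug-cc-pVQZ: 6s5p4d3f2g
AV5Z_FIRST_ROW = [7, 6, 5, 4, 3, 2]   # aug-cc-pV5Z: 7s6p5d4f3g2h

n_q = n_basis_functions(AVQZ_FIRST_ROW)
n_5 = n_basis_functions(AV5Z_FIRST_ROW)
cost_ratio = (n_5 / n_q) ** 4

print(n_q, n_5, round(cost_ratio, 2))
```

Under these simplifying assumptions, going from the quadruple-zeta to the quintuple-zeta level multiplies the $N^4$ factor by roughly six, which is why it pays to keep the (T) step one basis-set level below the CCSD step.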

2.5. The Inner-Shell Correlation Component of TAE 

Inner-shell correlation is a substantial part of the absolute corre- 
lation energy even for late first-row systems; for second-row systems, 
it in fact rivals the absolute valence correlation energy in importance. 
However, its relative contribution to molecular TAEs is fairly small:
in benzene, for instance, it amounts to less than 0.7 % of the TAE. 
Even so, at 7 kcal/mol, its contribution is important by any reasonable 
thermochemical standard. By the same token, a 1 % relative error in a 
7 kcal/mol contribution is tolerable even by benchmark thermochemistry 
standards, while the same relative error in a 300 kcal/mol contribution 
would be unacceptable even by the "chemical accuracy" standards. 
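
The arithmetic behind this argument can be spelled out in a few lines of Python. The variable names are ours; the thresholds are the "benchmark" (1 kJ/mol) and "chemical" (1 kcal/mol) accuracy criteria defined in the Introduction.

```python
KCAL_PER_KJ = 1.0 / 4.184            # thermochemical calorie
BENCHMARK_KCAL = 1.0 * KCAL_PER_KJ   # 1 kJ/mol, about 0.24 kcal/mol
CHEMICAL_KCAL = 1.0                  # 1 kcal/mol


def absolute_error(contribution_kcal, relative_error):
    """Absolute error (kcal/mol) implied by a relative error in one contribution."""
    return contribution_kcal * relative_error


inner_shell_error = absolute_error(7.0, 0.01)    # 1 % of a 7 kcal/mol term
large_term_error = absolute_error(300.0, 0.01)   # 1 % of a 300 kcal/mol term

print(inner_shell_error, inner_shell_error < BENCHMARK_KCAL)
print(large_term_error, large_term_error > CHEMICAL_KCAL)
```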

In addition, for thermochemical purposes we are primarily inter- 
ested in the core-valence correlation, since we can reasonably expect the 
core-core contributions to largely cancel between the molecule and its 
constituent atoms. (The partitioning between core-core correlation - 
involving excitations only from inner-shell orbitals - and core-valence 
correlation - involving simultaneous excitations from valence and inner- 
shell orbitals - was first proposed by Bauschlicher, Langhoff, and Taylor 
[42]). 

For these reasons, we feel justified in treating the inner-shell cor- 
relation contribution to TAE as a separate contribution, rather than 
together with the valence correlation. There are substantial cost
advantages to this: rather than having to carry out very elaborate
all-electrons-correlated CCSD(T) calculations in basis sets near saturation for both
valence and inner-shell correlation, we can limit these costly calculations 
to a basis set that is primarily saturated for inner-shell correlation. 

Inner-shell correlation contributions for the W2-1 set were studied
in some detail in the original W1/W2 paper, while subsequently, Martin,
Sundermann, Fast, and Truhlar (MSFT) [43] studied inner-shell
correlation contributions to TAE for 125 molecules spanning the first two rows
of the periodic table. The following conclusions can be drawn from these
two studies: (a) the use of the CCSD(T) electron correlation method is
absolutely required for reliable contributions: the use of MP2 or CCSD
can lead to underestimates on the order of 50 %; (b) the smallest basis set
which gives acceptable agreement with near-basis set limit contributions
is the MTsmall basis set, which is a completely decontracted cc-pVTZ
basis set with (2d1f) additional high-exponent correlation functions; (c)
the effect of including even higher excitations in the correlation
treatment is insignificant.

A tentative explanation for the importance of connected triple exci- 
tations for the inner-shell contribution to TAE can be found in the need 
to account for simultaneously correlating a valence orbital and relax- 
ing an inner-shell orbital, or conversely, requiring a double and a single 
excitation simultaneously. 

In principle, one could contract at least the few innermost s primitives 
and reduce the basis set further. By leaving the basis set completely un- 
contracted, however, we can recycle the integrals and SCF wavefunction 
for the next step of the calculation. 

Finally, it is generally advised not to correlate the very deep-lying 
(1s) orbitals on second-row elements, as the MTsmall basis set does
not have angular correlation functions in the required exponent range,
and in addition the orbitals concerned are in the same energy range as
the (2s, 2p) orbitals in third-row main group elements, for which being
able to take a [Ne] core out of the correlation problem does result in 
appreciable CPU time savings. 

2.6. Scalar Relativistic Correction 

The importance of scalar relativistic effects for compounds of tran- 
sition metals and/or heavy main group elements is well established by 
now [44]. Somewhat surprisingly (at first sight), they may have non- 
trivial contributions to the TAE of first-row and second-row systems as 
well, in particular if several polar bonds to a group VI or VII element are 
involved. For instance, in BF3, SO3, and SiF4, scalar relativistic effects
reduce TAE by 0.7, 1.2, and 1.9 kcal/mol, respectively - quantities which
clearly matter even if only "chemical accuracy" is sought. Likewise, in
a benchmark study on the electron affinities of the first- and second-row
atoms [45] - where we were able to reproduce the experimental values to
within 0.001 eV on average - we saw that neglect of the scalar relativistic
contributions increased mean deviation from experiment by more than
an order of magnitude.

Perhaps the simplest and most cost-effective way of treating rela- 
tivistic contributions in an all-electron framework is the first-order per- 
turbation theory of the one-electron Darwin and mass-velocity opera- 
tors [46, 47]. For variational wavefunctions, these contributions can be 
evaluated very efficiently as expectation values of one-electron operators. 

It has been found repeatedly [1, 43, 45] that scalar relativistic con- 
tributions are overestimated by about 20-25 % in absolute value at the 
SCF level. Hence inclusion of electron correlation is essential: we found 
the ACPF method (which is both variational and approximately size 
extensive) to be an excellent compromise between quality and cost. It 
is reasonable to suppose that for a property that becomes more impor- 
tant as one approaches the nucleus, one wants maximum flexibility of 
the wavefunction near the nucleus as well as correlation of all electrons;
thus we finally opted for ACPF/MTsmall as our approach of choice.
Typically the cost of the scalar relativistic step is a fairly small fraction
of that of the core correlation step, since only $n^2 N^4$ scaling is involved
in the ACPF calculations.

Bauschlicher [48] compared a number of approximate approaches 
for scalar relativistic effects to Douglas-Kroll quasirelativistic CCSD(T)
calculations. He found that the ACPF/MTsmall level of theory faith- 
fully reproduces his more rigorous calculations, while the use of non-size 
extensive approaches like CISD leads to serious errors. For third-row 
main group systems, studies by the same author [49] indicate that more 
rigorous approaches may be in order. 

2.7. Spin-Orbit Coupling 

The other relativistic effect entirely neglected so far is the spin-orbit 
coupling. For systems in nondegenerate states, the only first-order con- 
tribution to TAE comes from the fine structures in the corresponding 
atoms. Their effects can trivially be obtained from the observed elec- 
tronic spectra, and hence the computational cost of this correction is 
fundamentally zero. 

For systems in degenerate states, first-order corrections may need 
to be computed. In our work [26] we found that this significantly re- 
duced the mean absolute error for the G2-1 and G2-2 test sets for ion- 
ization potentials and electron affinities, in no small part due to the 
preponderance of atoms and linear molecules in these sets. We found 
that CISD/MTsmall generally yields quite satisfactory spin-orbit
corrections, but that it is advisable to correlate the (2s, 2p)-like electrons in the
second-row elements. For the halogen atoms, convergence of these con- 
tributions with the level of theory was studied in some detail by Nicklass 
et al. [50]. These authors came to fundamentally the same conclusions. 

2.8. The Zero-Point Vibrational Energy 

It has been noted repeatedly (e.g. [51, 52, 53]) that one-half the
sum of the harmonic frequencies, $\frac{1}{2}\sum_i \omega_i d_i$ (with $d_i$ representing the
degeneracy of mode $i$), generally leads to an overestimate of the Ezpv,
and that one-half the sum of the fundamentals, $\frac{1}{2}\sum_i \nu_i d_i$, generally leads
to an underestimate. In fact, it is easily shown that the average of these
two estimates is a fairly good approximation to the anharmonic Ezpv.

For the sake of convenience, we shall restrict ourselves to the case 
of symmetric tops, asymmetric tops being a special case thereof with no 
degenerate modes. Including only up to first-order anharmonicities $X_{ij}$,
and excluding the small constant $E_0$, the vibrational energy is given as

\[ G(n, l) = \sum_i \omega_i \left(n_i + \frac{d_i}{2}\right) + \sum_{i \le j} X_{ij} \left(n_i + \frac{d_i}{2}\right)\left(n_j + \frac{d_j}{2}\right) + S(l), \tag{2.6} \]

in which $S$ is the splitting term involving the angular momenta $l$ of
the degenerate vibrations, and $n_i$ represents the vibrational quantum
number for mode $i$. It trivially follows that the zero-point energy Ezpv
is given by

\[ E_{\mathrm{ZPV}} = \sum_i \frac{\omega_i d_i}{2} + \sum_{i \le j} X_{ij} \frac{d_i d_j}{4}. \tag{2.7} \]

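In addition we find that [introducing the shorthand $G(n, l)^0 = G(n, l) - G(0)$]

\[
\begin{aligned}
G(n, l)^0 &= \sum_i \omega_i n_i + \sum_{i \le j} X_{ij} \left[\left(n_i + \frac{d_i}{2}\right)\left(n_j + \frac{d_j}{2}\right) - \frac{d_i d_j}{4}\right] + S(l) \\
&= \sum_i \omega_i n_i + \sum_{i \le j} X_{ij} \left[n_i n_j + n_i \frac{d_j}{2} + n_j \frac{d_i}{2}\right] + S(l) \\
&= \sum_i \omega_i n_i + \sum_i X_{ii} \left(n_i^2 + n_i d_i\right) \\
&\quad + \frac{1}{2} \sum_{i \ne j} X_{ij} \left[n_i n_j + n_i \frac{d_j}{2} + n_j \frac{d_i}{2}\right] + S(l).
\end{aligned}
\tag{2.8}
\]

Now assume only $n_k$ is nonzero, then

\[
\begin{aligned}
G(n_k, l_k)^0 &= \omega_k n_k + X_{kk}\, n_k (n_k + d_k) + \frac{1}{2} \sum_{i \ne k} (X_{ik} + X_{ki})\, n_k \frac{d_i}{2} + S(l_k) \\
&= \omega_k n_k + X_{kk}\, n_k (n_k + d_k) + \sum_{i \ne k} X_{ik}\, n_k \frac{d_i}{2} + G_{kk} l_k^2 .
\end{aligned}
\tag{2.9}
\]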

It then follows that 

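\[
\begin{aligned}
\sum_k \nu_k \frac{d_k}{2} &= \sum_k \omega_k \frac{d_k}{2} + \sum_k X_{kk} (1 + d_k) \frac{d_k}{2} + \sum_k \sum_{i \ne k} X_{ik} \frac{d_i d_k}{4} + \sum_k^{\mathrm{degen.}} \frac{d_k}{2} G_{kk} l_k^2 \\
&= \sum_k \omega_k \frac{d_k}{2} + \sum_k X_{kk} \frac{d_k^2}{2} + \sum_k X_{kk} \frac{d_k}{2} + \sum_{k > i} X_{ik} \frac{d_i d_k}{2} + \sum_k^{\mathrm{degen.}} \frac{d_k}{2} G_{kk} l_k^2 \\
&= \sum_k \omega_k \frac{d_k}{2} + \sum_{k \ge i} X_{ik} \frac{d_i d_k}{2} + \sum_k X_{kk} \frac{d_k}{2} + \sum_k^{\mathrm{degen.}} \frac{d_k}{2} G_{kk} l_k^2 .
\end{aligned}
\tag{2.10}
\]

That is,

\[
\begin{aligned}
\sum_k (\omega_k + \nu_k) \frac{d_k}{4} &= \sum_k \omega_k \frac{d_k}{2} + \sum_{k \ge i} X_{ik} \frac{d_i d_k}{4} + \sum_k X_{kk} \frac{d_k}{4} + \sum_k^{\mathrm{degen.}} \frac{d_k}{4} G_{kk} l_k^2 \\
&= E_{\mathrm{ZPV}} + \sum_k X_{kk} \frac{d_k}{4} + \sum_k^{\mathrm{degen.}} \frac{d_k}{4} G_{kk} l_k^2 ,
\end{aligned}
\]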

in which the $G_{kk}$ are the diagonal $l$-coupling constants. The last
term is generally negligible. If so desired, the term involving the
diagonal anharmonicity constants can be estimated from anharmonicities
in diatomic molecules.
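
The result above can be checked numerically. The following is a minimal sketch, not part of the original derivation: it evaluates Eq. (2.7) and the fundamentals of Eq. (2.9) for a made-up three-mode symmetric top (all constants are illustrative placeholders, and the $G_{kk} l_k^2$ term is neglected) and compares the average of the two half-sums with the anharmonic zero-point energy.

```python
def zpve_anharmonic(omega, x, d):
    """Eq. (2.7): sum_i omega_i d_i/2 + sum_{i<=j} X_ij d_i d_j/4."""
    n = len(omega)
    harmonic = sum(omega[i] * d[i] / 2 for i in range(n))
    anharmonic = sum(x[i][j] * d[i] * d[j] / 4
                     for i in range(n) for j in range(i, n))
    return harmonic + anharmonic


def fundamental(k, omega, x, d):
    """Eq. (2.9) with n_k = 1, neglecting the G_kk l_k^2 term."""
    cross = sum(x[i][k] * d[i] / 2 for i in range(len(omega)) if i != k)
    return omega[k] + x[k][k] * (1 + d[k]) + cross


# Made-up constants in cm^-1 (illustrative only); x is symmetric.
omega = [3000.0, 1600.0, 1200.0]
d = [1, 2, 1]  # the second mode is doubly degenerate
x = [[-50.0, -10.0, -5.0],
     [-10.0, -8.0, -3.0],
     [-5.0, -3.0, -6.0]]

nu = [fundamental(k, omega, x, d) for k in range(len(omega))]

# Average of the harmonic and fundamental half-sums
average = sum((nu[k] + omega[k]) * d[k] / 4 for k in range(len(omega)))
exact = zpve_anharmonic(omega, x, d)
diagonal_term = sum(x[k][k] * d[k] / 4 for k in range(len(omega)))

print(average - exact, diagonal_term)
```

For these placeholder constants both printed numbers are -18.0 cm^-1: the average differs from the anharmonic zero-point energy only by the diagonal term $\sum_k X_{kk} d_k/4$, as the derivation states.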

The common practice of scaling computed vibrational frequencies
for comparison with experimental fundamentals attempts to address,
approximately, two issues: (a) the imperfections of the theoretical
model for the harmonic frequency (a correction which for CCSD(T), or
even B3LYP, in sufficiently large basis sets is basically unnecessary);
and (b) the anharmonic contribution to the fundamental. The above
analysis suggests that a scaling factor intermediate between those used
for reproducing harmonics and fundamentals would be the most appropriate
for anharmonic zero-point energies. In the original W1 paper [1], we
considered the essentially exact anharmonic values of $E_{\mathrm{ZPV}}$
of the 28 W2-1 molecules (determined from experiment or large basis set
CCSD(T) quartic force field calculations, e.g. [54] and the references
therein) and found the appropriate scaling factor for B3LYP/VTZ+1
harmonic frequencies to be 0.985. The largest individual deviation
between the scaled harmonic and exact anharmonic values of
$E_{\mathrm{ZPV}}$ was only 0.3 kcal/mol (for PH3).

Some of the above remarks are probably best illustrated by an
example. For benzene, a B3LYP/TZ2P quartic force field was computed by
Handy and coworkers [55]. From the published anharmonicity constants
(specifically, the set deperturbed for Fermi resonances closer than 100
cm^-1), we obtain an anharmonic $E_{\mathrm{ZPV}}$ of 62.04 kcal/mol.
For comparison, one-half the sum of the harmonics comes out 0.9 kcal/mol
too high at 62.96 kcal/mol, and one-half the sum of the fundamentals
comes out 1 kcal/mol too low at 60.98 kcal/mol. The average of both
values, 61.97 kcal/mol, is in excellent agreement with the anharmonic
value, while the W1 estimate accidentally agrees to within two decimal
places with the B3LYP/TZ2P anharmonic value. From the best available
computed harmonic frequencies [56] and the best available experimental
fundamentals [55], we obtain $E_{\mathrm{ZPV}}$ = 62.01 kcal/mol or,
after correcting for the 0.07 kcal/mol difference between this type of
estimate and the true anharmonic $E_{\mathrm{ZPV}}$ at the B3LYP/TZ2P
level, $E_{\mathrm{ZPV}}$ = 62.08 kcal/mol as possibly the best
estimate. (Note that HF/6-31G* harmonic frequencies scaled by 0.8929,
as used in G2 and G3 theories, yield only 60.33 kcal/mol. In this
accuracy range, one certainly cannot indulge in a 1.7 kcal/mol
underestimate in the zero-point energy!)
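
The arithmetic of the benzene example can be retraced in a few lines. This is only a sketch using the numbers quoted in the paragraph above; the variable names are ours.

```python
# Benzene zero-point energies quoted in the text (kcal/mol)
half_sum_harmonics = 62.96      # B3LYP/TZ2P, one-half the sum of harmonics
half_sum_fundamentals = 60.98   # B3LYP/TZ2P, one-half the sum of fundamentals
anharmonic = 62.04              # B3LYP/TZ2P anharmonic value
best_average = 62.01            # best harmonics [56] averaged with best fundamentals [55]
hf_scaled = 60.33               # HF/6-31G* harmonics scaled by 0.8929

# Average of the two half-sums at the B3LYP/TZ2P level
average = (half_sum_harmonics + half_sum_fundamentals) / 2

# Residual of the averaging approximation, transferred to the best average
correction = anharmonic - average
best_estimate = best_average + correction

print(f"average        = {average:.2f}")                     # 61.97
print(f"correction     = {correction:.2f}")                  # 0.07
print(f"best estimate  = {best_estimate:.2f}")               # 62.08
print(f"HF-based error = {anharmonic - hf_scaled:.2f}")      # 1.71
```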

In a recent benchmark study [57] on the CH2=NH molecule, we
explicitly computed a CCSD(T)/VTZ quartic force field at great expense
(the low symmetry necessitated the computation of 2241 energy points
in $C_s$ symmetry and 460 additional points in $C_1$ symmetry). The
resulting anharmonic $E_{\mathrm{ZPV}}$, 24.69 kcal/mol, is only 0.10
kcal/mol above the scaled B3LYP/VTZ estimate, 24.59 kcal/mol. At least
for fairly rigid molecules, it appears hard to justify the additional
expense and effort for the anharmonic force field unless it were
required anyway for other purposes.

If we use B3LYP/VTZ+1 harmonics scaled by 0.985 for the
$E_{\mathrm{ZPV}}$ rather than the actual anharmonic values, the mean
absolute error at the W1 level deteriorates from 0.37 to 0.40 kcal/mol,
which most users would regard as insignificant. At the W2 level,
however, we see a somewhat more noticeable degradation from 0.23 to
0.30 kcal/mol: if kJ/mol accuracy is required, literally "every little
bit counts". If one is primarily concerned with keeping the maximum
absolute error down, rather than getting sub-kJ/mol accuracy for
individual molecules, the use of B3LYP/VTZ+1 harmonic values of
$E_{\mathrm{ZPV}}$ scaled by 0.985 is an acceptable "fallback
solution". The same would appear to be true for thermochemical
properties to which the $E_{\mathrm{ZPV}}$ contribution is smaller than
for the TAE (e.g. ionization potentials, electron affinities, proton
affinities, and the like).

3. PERFORMANCE OF W1 AND W2 THEORIES

A reliable assessment of the performance of a method in the kJ/mol 
accuracy range is, by its very nature, only possible where experimental 
data are themselves known to this accuracy. 

3.1. Atomization Energies (the W2-1 Set) 

In the original W1/W2 paper [1], we selected a set of 28 first- and
second-row molecules (which we shall call the W2-1 set) containing at
most three nonhydrogen atoms for which (a) the experimental total
atomization energies $D_0$ are available to the highest possible
accuracy (preferably 0.1 kcal/mol); (b) no strong nondynamical
correlation effects exist that would hinder the applicability of
single-reference electron correlation methods; (c) near-exact anharmonic
values of $E_{\mathrm{ZPV}}$ are available from either experimental
anharmonicity constants or highly accurate ab initio anharmonic force
fields.

Results using W1 and W2 theories are shown in Table 2.1. For W2
theory we find a mean absolute deviation (MAD) of 0.23 kcal/mol, which
further drops to 0.18 kcal/mol when the NO, O2, and F2 molecules are
deleted (all of which have mild nondynamical correlation in common).
Our largest deviation is 0.70 kcal/mol. We can hence state that W2
meets our design goals.

Table 2.1. Comparison of W2 and W1 theories, and their variants, for the
evaluation of $\mathrm{TAE}_0$ (kcal/mol) for the W2-1 test set.

            Experimental(a)      Deviation (experiment - theory)
Species      Value  Uncert.   W2(b)   W2(c)     W2h      W1     W1h     W1c
H2 ... CS   [entries for H2, N2, O2, F2, HF, CH, CO, NO and CS are illegible in the source]
SO          123.58     0.04   -0.02   -0.04            0.52            0.57
HCl         102.24     0.02   -0.04   -0.14           -0.15           -0.17
ClF          60.36     0.01    0.09    0.08            0.15            0.03
Cl2          57.18     0.00   -0.20   -0.24            0.60            0.50
HNO         196.85     0.06    0.38    0.37            0.20           -0.03
CO2         381.91     0.06    0.14    0.13    0.10   -0.37   -0.34   -0.37
H2O         219.35     0.12   -0.04   -0.14           -0.55           -0.58
H2S         173.15     0.12   -0.37   -0.49           -0.47           -0.51
HOCl        156.61     0.12   -0.16   -0.24           -0.18           -0.40
OCS         328.53     0.48   -0.19   -0.21   -0.21   -0.01    0.11    0.10
ClCN        279.20     0.48    0.41    0.52    0.78    0.78    0.91    0.82
SO2         253.92     0.08   -0.31   -0.33            0.63            0.81
CH3         289.00     0.10   -0.21   -0.32   -0.38   -0.53   -0.51   -0.39
NH3         276.73     0.13    0.13   -0.03           -0.28           -0.17
PH3         227.13     0.41   -0.01    0.28            0.23            0.05
C2H2        388.90     0.24    0.42    0.64    0.53    0.26    0.51    0.29
CH2O        357.25     0.12   -0.27   -0.40   -0.35   -0.59   -0.56   -0.76
CH4         392.51     0.14   -0.11   -0.13   -0.19   -0.35   -0.47   -0.34
C2H4        531.91     0.17   -0.19   -0.31   -0.32   -0.63   -0.41   -0.72

Mean Absolute Deviation        0.23    0.29    0.30    0.40    0.41    0.39
Max. Absolute Deviation        0.64    0.78    0.78    0.78    0.95    0.82

(a) See [1] for experimental references.
(b) Values of $E_{\mathrm{ZPV}}$ derived from anharmonic vibrational
    frequencies. See Ref. 1 for details.
(c) Values of $E_{\mathrm{ZPV}}$ derived from B3LYP/VTZ+1 harmonic
    vibrational frequencies scaled by 0.985. Same remark applies to W2h,
    W1, W1h and W1c data given. For systems where W2h and W1h are
    equivalent to W2 and W1, respectively, entries have been left blank.

For W1 theory, MAD is increased to 0.37 kcal/mol (old SCF
extrapolation) or 0.40 kcal/mol (new SCF extrapolation), with the
maximum error being 0.78 kcal/mol. This should be compared with a MAD
of 1.25 kcal/mol for G2 theory, 0.89 kcal/mol for G3 theory, 0.88
kcal/mol for CBS-Q, and 0.61 kcal/mol for CBS-QB3, and the much higher
maximum errors of these methods of 4.90 kcal/mol (SO2), 3.80 kcal/mol
(SO2), 3.10 kcal/mol (OCS), and 1.90 kcal/mol (OCS), respectively.
While we would prefer to use W2 theory for no-nonsense benchmarking if
at all possible, W1 theory still seems to offer great advantages over
the other techniques.

3.2. Electron Affinities (the G2/97 Set) 

Some representative results can be found in Table 2.2. For the
G2-1 set of electron affinities, W1 theory has a mean absolute error of
0.016 eV [26]. Not unexpectedly, given the slow basis set convergence
of electron affinities, the extra effort invested in W2 theory pays off
with a further reduction of the mean absolute error to 0.012 eV.
Accuracy appears to be limited principally by imperfections in the
CCSD(T) method: for the atoms B-F and Al-Cl, using even larger basis
sets we achieve 0.009 eV at the CCSD(T) level, which decreases to 0.001
eV if approximate full CI energies are used.

Normally W1 theory does not involve diffuse functions on H, Li, Na,
Be, and Mg; not surprisingly, this leads to very poor electron
affinities for Li and Na. Upon switching to W1aug (i.e. using augmented
basis sets on all elements), perfect agreement with experiment is
obtained. Within the G2-2 set, substantial discrepancies between W1
theory and experiment are found for O3 and CH2NC, both of which are
systems with pronounced multireference character. (The same remark
applies to a lesser extent to FO.) Scalar relativistic effects almost
invariably decrease the electron affinity. Neglect of spin-orbit
splitting leads to significant deterioration in MAD.

3.3. Ionization Potentials (the G2/97 Set) 

Some representative results can again be found in Table 2.2. At the
W1 level, the G2-1 ionization potentials are reproduced with a MAD of
only 0.013 eV [26]. No further improvement is seen at the W2 level
for this property. Note that if the B3LYP/VTZ geometry for CH4+ is
employed, a serious error is seen for IP(CH4), which disappears when a
CCSD(T)/VTZ reference geometry is used instead. (Only BH&HLYP [58] and
mPW1K [59] correctly predict a $C_{2v}$ structure for CH4+; other
exchange-correlation functionals wrongly lead to a $D_{2d}$ structure.)

Table 2.2. Comparison of W2 and W1 theories, and their variants, for the
evaluation of electron affinity and ionization potential (eV) for
selected species from the G2-1 test set.

              Experimental(a)       Deviation (experiment - theory)
Species        Value    Uncert.      W2     W2h      W1     W1h

Electron Affinities
C             1.2629     0.0003   0.007   0.041   0.011   0.210
Si           1.38946    0.00006   0.010   0.081   0.011   0.060
CH             1.238     0.0078   0.029   0.060   0.032   0.248
CH2            0.652      0.006   0.002   0.042   0.011   0.236
CH3             0.08       0.03   0.034   0.088   0.051   0.284
SiH           1.2771     0.0087   0.031   0.094   0.034   0.084
SiH2           1.123      0.022   0.039   0.088   0.043   0.087
SiH3           1.406      0.014   0.011   0.033   0.019   0.044
CN             3.862      0.005  -0.026  -0.036  -0.031  -0.023

Ionization Potentials
B            8.29802    0.00002   0.007   0.009   0.019   0.020
C            11.2603     0.0001   0.010  -0.002   0.012   0.012
Al             5.986      0.001   0.023   0.022   0.024   0.025
Si           8.15166    0.00003   0.018  -0.004   0.021   0.022
CH4 (b)        12.61       0.01  -0.033  -0.035  -0.032  -0.035
SiH4           11.00       0.02   0.006   0.006  -0.005  -0.005
C2H2          11.403     0.0003  -0.004  -0.004  -0.001   0.005
C2H4         10.5138     0.0006  -0.001   0.001  -0.005   0.000
CO           14.0142     0.0003  -0.014  -0.013  -0.009  -0.008
CS             11.33       0.01  -0.017  -0.018  -0.017  -0.016

(a) See Ref. 26 for experimental references.
(b) CCSD(T)/VTZ geometry. B3LYP/VTZ optimization erroneously yields a
    $D_{2d}$ structure for the cation rather than the correct $C_{2v}$
    symmetry. See Ref. 26 for details.

Inner-shell correlation contributions are found to be somewhat more
important for ionization potentials than for electron affinities, which
is understandable in terms of the creation of a valence 'hole' by
ionization into which inner-shell electrons can be excited. Again,
inclusion of spin-orbit splitting is worthwhile.

3.4. Heats of Formation (the G2/97 Set) 

A detailed discussion and a table can be found in Ref. 26. First
of all, we note that the mean uncertainty for the experimental values
in the G2-1 set is itself 0.6 kcal/mol. MAD values for W1 and W2
theory stand at 0.6 and 0.5 kcal/mol, respectively, suggesting that these
theoretical methods have a reliability comparable to the experimental
data themselves.

For a subset of 27 G2-2 molecules with fairly small experimental
uncertainties, W1 theory had a MAD of 0.7 kcal/mol, compared to the
average experimental uncertainty of 0.4 kcal/mol. Some systems exhibit
deviations from experiment in excess of 1 kcal/mol: in the cases of BF3
and CF4, very slow basis set convergence is responsible, and W2
calculations in fact remove nearly all remaining disagreement with
experiment for the latter system. (The best available value for BF3 is
itself a theoretical one, so a comparison would involve circular
reasoning.) Other molecules (NO2 and ClNO) suffer from severe
multireference effects.

3.5. Proton Affinities 

For proton affinities, W1 theory can basically be considered
converged [26]. The W2 computed values are barely different from their
W1 counterparts, and the latter's MAD of 0.43 kcal/mol is well below
the roughly 1 kcal/mol uncertainty in the experimental values. W1
theory would appear to be the tool of choice for the generation of
benchmark proton affinity data for calibration of more approximate
approaches.



4. VARIANTS AND SIMPLIFICATIONS 

4.1. W1' Theory

It was noted that the original W1 theory (old-style SCF
extrapolation) performed considerably more poorly for second-row than
for first-row species. This was ascribed to the lack of balance in the
basis sets for second-row atoms used in the SCF and valence correlation
steps of W1; in particular, the A'VTZ+2d1f basis set contains as many
"tight" d and f functions as regular ones, which would appear to be a
bit top-heavy. It was proposed to replace the A'VTZ+2d1f basis set by
A'VTZ+2d, a conclusion borne out by calculations on the SO3 molecule
[28], which suffers from extreme inner polarization effects and as such
provides a good "proving ground".

Compared to its prototype, the modification (the so-called W1'
theory) did appear to yield improved results for second-row molecules.
However, in the W1/W2 validation study [26] we found this to be an
artifact of the exaggerated sensitivity of the (old-style) 3-point
geometric SCF extrapolation. Use of the new-style $E_\infty + A/L^5$
extrapolation largely eliminates both the problem and the difference
between W1 and W1' theory.
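
A two-point extrapolation of the form $E(L) = E_\infty + A/L^\alpha$ can be solved in closed form for $E_\infty$. The sketch below is illustrative only: the function name is ours, the exponent is left as a parameter, and the default of 5 is our assumption about the new-style SCF extrapolation rather than something the code establishes. The energies in the usage example are synthetic.

```python
def extrapolate_two_point(e_small, e_large, l_small, l_large, alpha=5.0):
    """Solve E(L) = E_inf + A / L**alpha for E_inf from two basis sets.

    e_small, e_large: energies with the smaller and larger basis set
    l_small, l_large: their cardinal numbers (e.g. 3 for VTZ, 4 for VQZ)
    alpha: extrapolation exponent (assumed, see the note above)
    """
    if l_large <= l_small:
        raise ValueError("l_large must exceed l_small")
    ratio = (l_large / l_small) ** alpha
    return e_large + (e_large - e_small) / (ratio - 1.0)


# Synthetic check: build energies from a known limit and recover it.
e_inf_true, a_true = -100.0, 2.0
e3 = e_inf_true + a_true / 3**5
e4 = e_inf_true + a_true / 4**5
print(extrapolate_two_point(e3, e4, 3, 4))
```

The larger the gap between the largest-basis energy and the extrapolated limit, the more the result depends on the assumed functional form.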

4.2. W1h and W2h Theories

While the need for diffuse-function augmented basis sets for highly 
electronegative elements is well established (e.g. [34]), it could be ar- 
gued that they are not really required on group III and IV elements. 
For organic-type molecules in particular, this would result in significant 
savings. 

We define here W1h and W2h theories, respectively, as the
modifications of W1 and W2 theories for which AVnZ basis sets are only
used on elements of groups V, VI, VII, and VIII, but regular VnZ basis
sets on groups I, II, III, and IV. (The "h" stands for "heteroatom", as
we originally investigated this for organic molecules.) For the purpose
of the present paper, we have repeated the validation calculations
described in the previous section for W1h and W2h theories. (For about
half of the systems, W1 and W1h are trivially equivalent.) Some
representative results can be found in Table 2.1 for atomization
energies/heats of formation, and in Table 2.2 for ionization potentials
and electron affinities.

For the heats of formation in the G2-1 set, the largest difference
between W1 and W1h theory is 0.3 kcal/mol for Si2; the average
difference is less than 0.1 kcal/mol. For some of the systems in the
G2-2 set, however, differences are more pronounced, e.g. 0.6 kcal/mol
for CF4 and 0.8 kcal/mol for benzene. (Note that the benzene calculation
reported as an example application in the original W1 paper [1] is in
fact a W1h calculation: the remaining small difference between that
reference and the present work is due to the different SCF
extrapolations used.) For the G2-1 heats of formation, W2h and W2 are
essentially indistinguishable in quality, as could reasonably be
expected.

For the G2-1 ionization potentials, the largest differences are 0.005
and 0.006 eV, respectively, for ethylene and acetylene. Differences in
the G2-2 set are likewise small, although Si2H2 (0.009 eV) and CH3OF
(0.024 eV) stand out. Clearly W1h is of a quality comparable to W1 for
ionization potentials, and we recommend it as a moderately inexpensive
high-accuracy method for this property. (As noted before, W2 does not
represent an improvement over W1 for ionization potentials, and the
same goes for W2h theory.)

For electron affinities, the differences between W1h and W1 are very
pronounced, and become (as expected) particularly large (e.g. 0.284 eV
in CH3) for species where none of the atoms carry diffuse functions in
W1h theory. The differences between W2 and W2h theory are still quite
sizable, and in fact agreement with experiment for W2h is inferior to that
for the less expensive W1 method. In summary, we do not recommend
W1h or W2h for electron affinities.

4.3. A Bond-Equivalent Model for Inner-Shell Correlation

In a pilot W1h calculation on benzene [1], it was found that 85 % of
the CPU time was spent on the inner-shell correlation step. Given that
this contribution is about 0.5 % of the TAE of benzene, the CPU time
proportion appears to be lopsided to say the least. On the other hand, a
contribution of 7 kcal/mol clearly cannot be neglected by any reasonable
standard. However, inner-shell correlation is by its very nature a much
more local phenomenon than valence correlation, and a relative error
of a few percent in such a small contribution is more tolerable than a
corresponding error in the major contributions. Martin, Sundermann,
Fast and Truhlar (MSFT) [43] therefore investigated the applicability
of a bond-equivalent model.

We started by generating a database of inner-shell correlation
contributions for some 130 molecules that cover the first two rows of
the periodic table. In order to reduce the number of parameters in the
model to be fitted, we introduced a Mulliken-type approximation for the
parameters, $D_{AB} \approx (D_A + D_B)/2$. Furthermore, we did retain
different parameters for single and multiple bonds, but assumed
$D_{A \equiv B} \approx (3/2) D_{A=B}$.
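
To make the structure of such a bond-equivalent model concrete, here is a minimal sketch. It assumes one atomic parameter per element for single bonds and one for double bonds, the Mulliken-type averaging for a pair, and the 3/2 rule for triple bonds. All names and all numerical parameters are hypothetical placeholders, not the fitted MSFT values.

```python
from typing import Dict, Iterable, Tuple

Bond = Tuple[str, str, int]  # (element A, element B, bond order 1, 2 or 3)


def pair_parameter(a: str, b: str, order: int,
                   single: Dict[str, float],
                   double: Dict[str, float]) -> float:
    """Bond parameter D_AB from atomic parameters (Mulliken-type average)."""
    if order == 1:
        return 0.5 * (single[a] + single[b])
    d_double = 0.5 * (double[a] + double[b])
    if order == 2:
        return d_double
    if order == 3:
        return 1.5 * d_double  # assumed D(triple) = (3/2) D(double)
    raise ValueError(f"unsupported bond order: {order}")


def bond_equivalent_estimate(bonds: Iterable[Bond],
                             single: Dict[str, float],
                             double: Dict[str, float]) -> float:
    """Sum of bond parameters over an explicit list of bonds."""
    return sum(pair_parameter(a, b, order, single, double)
               for a, b, order in bonds)


# Placeholder parameters in kcal/mol (NOT the published fit).
single = {"H": 0.20, "C": 0.30, "O": 0.10}
double = {"C": 1.00, "O": 0.60}

acetylene = [("C", "C", 3), ("C", "H", 1), ("C", "H", 1)]
print(bond_equivalent_estimate(acetylene, single, double))
```

Note that the bond list has to be supplied by the user, which is exactly the dependence on explicit connectivity information that distinguishes this kind of model from approaches that work directly on the wavefunction.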

The model (which requires essentially no CPU time) was found 
to work very satisfactorily; its performance for the W2-1 set can be 
seen in Table 2.3. Somewhat to our surprise, we found that the same 
model performs reasonably well when applied to the scalar relativistic 
contributions, albeit with larger individual deviations. 

It was recently suggested by Nicklass and Peterson [60] that the
use of core polarization potentials (CPPs) [61] could be an inexpensive
and effective way to account for the effects of inner-shell correlation.
The great potential advantage of this indeed rather inexpensive method
over the MSFT bond-equivalent model is that it does not depend on
any explicit connectivity information. The different approximate
treatments of inner-shell correlation are compared with large-scale
CCSD(T) results for the W2-1 set in Table 2.3. As seen there, while the
CPP approach is indeed quite promising (clearly superior to MP2
calculations, for instance), it clearly requires further refinement.
The MSFT bond-equivalent model in fact outperforms all other
approximate methods, with a computational cost that is essentially nil.

Table 2.3. Comparison of core correlation contributions to
$\mathrm{TAE}_0$ (kcal/mol) for the W2-1 test set.
[Column headings and the rows preceding HNO are missing from the source;
columns are numbered in their original order.]

Species                      (1)     (2)     (3)     (4)     (5)
[label lost]                0.19    0.18    0.29
HNO                         0.40    0.41    0.68    0.41    0.69
CO2                         1.64    1.67    1.68    1.12    1.88
H2O                         0.37    0.37    0.36    0.20    0.39
H2S                         0.34    [illegible]
HOCl                        0.31    0.29    0.50
OCS                         1.68    1.58    1.49
ClCN                        1.76    1.71    1.73
SO2                         0.67    0.78    0.68
CH3                         1.04    1.04    0.89    0.37    0.84
NH3                         0.62    0.64    0.49    0.29    0.62
PH3                         0.30    0.22    0.35
C2H2                        2.44    2.34    2.38    1.17    2.17
CH2O                        1.25    1.26    1.44    0.65    1.24
CH4                         1.21    1.21    1.19    0.48    1.01
C2H4                        2.36    2.27    2.38    1.02    2.02

Mean Absolute Deviation             0.04    0.12    0.39    0.19
Max. Absolute Deviation             0.11    0.33    1.34    0.34

C6H6                                7.09    7.13            6.30

(a) See Ref. 1 for details.
(b) See Ref. 60 for details.

4.4. Reduced-Cost Approaches to the Scalar Relativistic 
Correction 

The fact that the additivity model for the scalar relativistic
correction worked at all is a pleasant surprise; yet alternatives
clearly merit exploration. As noted above, the SCF-level scalar
relativistic contributions of Kedziora et al. [62] are systematically
overestimated. One possibility which suggests itself then would be
applying a scaling factor to the SCF values: we have considered this
approach for the set of 120 molecules for which ACPF/MTsmall data were
generated by MSFT for the purposes of parameterizing their empirical
model. However, rather than following the more elaborate approach of
Kedziora et al., we simply evaluated the first-order Darwin and
mass-velocity corrections by perturbation theory. We considered
variation of the basis set, and found, not surprisingly, that typical
contracted VnZ basis sets are insufficiently flexible in the core
region. We found VTZuc+1 (where VTZuc stands for an uncontracted
cc-pVTZ basis set) to be the best compromise between cost and quality.

The best scale factor in the least-squares sense is 0.788; while the 
mean absolute error of 0.04 kcal/mol is more than acceptable, the max- 
imum absolute error of 0.20 kcal/mol (for SO2) is somewhat disappoint- 
ing. Representative results (for the W2-1 set) can be found in Table 
2.4. 

This error can be considerably reduced, at very little cost, by
employing B3LYP density functional theory instead of SCF. The scale
factor, 0.896, is much closer to unity, and both mean and maximum
absolute errors are cut in half compared to the scaled SCF level
corrections. (The largest errors in the 120-molecule data set are 0.10
kcal/mol for P2 and 0.09 kcal/mol for BeO.) It could in fact be argued
that the remaining discrepancy between the scaled B3LYP/cc-pVTZuc+1
values and the ACPF/MTsmall reference is on the same order of magnitude
as the uncertainty in the ACPF/MTsmall values themselves.
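
For a single multiplicative factor $s$ with no intercept, the least-squares condition has the closed-form solution $s = \sum_i x_i y_i / \sum_i x_i^2$, where $x_i$ are the unscaled approximate values and $y_i$ the reference values. The sketch below illustrates this; the function name and the three data points are invented for illustration and are not taken from the 120-molecule set.

```python
def best_scale_factor(approx, reference):
    """Least-squares scale factor s minimizing sum_i (s*x_i - y_i)**2."""
    if len(approx) != len(reference):
        raise ValueError("input sequences must have equal length")
    numerator = sum(x * y for x, y in zip(approx, reference))
    denominator = sum(x * x for x in approx)
    if denominator == 0.0:
        raise ValueError("approximate values must not all be zero")
    return numerator / denominator


# Invented example: unscaled corrections and reference values (kcal/mol)
unscaled = [-0.10, -0.25, -0.60]
reference = [-0.08, -0.21, -0.47]

s = best_scale_factor(unscaled, reference)
residuals = [s * x - y for x, y in zip(unscaled, reference)]
print(f"scale factor = {s:.3f}")
print("max |residual| =", max(abs(r) for r in residuals))
```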

Table 2.4. Comparison of scalar relativistic effect contributions to
$\mathrm{TAE}_0$ (kcal/mol) for the W2-1 test set.

Species     ACPF/MTsmall    MSFT model    B3LYP/VTZuc+1      SCF/VTZuc+1
                                           scaled 0.896     scaled 0.788
H2                  0.00          0.00             0.00             0.00
N2                 -0.11         -0.14            -0.15            -0.16
O2                 -0.15         -0.30            -0.18            -0.22
F2                  0.03         -0.37            -0.04            -0.09
HF                 -0.20         -0.19            -0.18            -0.20
CH                 -0.03         -0.05            -0.04            -0.04
CO                 -0.14         -0.33            -0.17            -0.19
NO                 -0.16         -0.20            -0.20            -0.22
CS                 -0.15         -0.29            -0.21            -0.25
SO                 -0.31         -0.27            -0.34            -0.40
HCl                -0.26         -0.17            -0.25            -0.26
ClF                -0.12         -0.35            -0.16            -0.23
Cl2                -0.15         -0.34            -0.19            -0.26
HNO                -0.24         -0.28            -0.27            -0.29
CO2                -0.45         -0.44            -0.48            -0.50
H2O                -0.26         -0.26            -0.25            -0.26
H2S                -0.41         -0.43            -0.39            -0.40
HOCl               -0.28         -0.43            -0.31            -0.37
OCS                -0.53         -0.41            -0.57            -0.57
ClCN               -0.43         -0.40            -0.47            -0.47
SO2                -0.71         -0.61            -0.79            -0.90
CH3                -0.17         -0.14            -0.17            -0.16
NH3                -0.25         -0.24            -0.25            -0.24
PH3                -0.46         -0.60            -0.45            -0.46
C2H2               -0.27         -0.31            -0.28            -0.26
CH2O               -0.32         -0.32            -0.33            -0.34
CH4                -0.19         -0.19            -0.19            -0.18
C2H4               -0.33         -0.34            -0.33            -0.31

Mean Absolute Deviation           0.08             0.03             0.05
Max. Absolute Deviation           0.40             0.08             0.20

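
The summary rows of Table 2.4 can be recomputed from the tabulated entries. The short sketch below does this with the ACPF/MTsmall column as the reference; the data are transcribed from the table, while the variable and function names are ours.

```python
# (ACPF/MTsmall, MSFT model, scaled B3LYP, scaled SCF), kcal/mol, Table 2.4
TABLE_2_4 = {
    "H2":   (0.00, 0.00, 0.00, 0.00),
    "N2":   (-0.11, -0.14, -0.15, -0.16),
    "O2":   (-0.15, -0.30, -0.18, -0.22),
    "F2":   (0.03, -0.37, -0.04, -0.09),
    "HF":   (-0.20, -0.19, -0.18, -0.20),
    "CH":   (-0.03, -0.05, -0.04, -0.04),
    "CO":   (-0.14, -0.33, -0.17, -0.19),
    "NO":   (-0.16, -0.20, -0.20, -0.22),
    "CS":   (-0.15, -0.29, -0.21, -0.25),
    "SO":   (-0.31, -0.27, -0.34, -0.40),
    "HCl":  (-0.26, -0.17, -0.25, -0.26),
    "ClF":  (-0.12, -0.35, -0.16, -0.23),
    "Cl2":  (-0.15, -0.34, -0.19, -0.26),
    "HNO":  (-0.24, -0.28, -0.27, -0.29),
    "CO2":  (-0.45, -0.44, -0.48, -0.50),
    "H2O":  (-0.26, -0.26, -0.25, -0.26),
    "H2S":  (-0.41, -0.43, -0.39, -0.40),
    "HOCl": (-0.28, -0.43, -0.31, -0.37),
    "OCS":  (-0.53, -0.41, -0.57, -0.57),
    "ClCN": (-0.43, -0.40, -0.47, -0.47),
    "SO2":  (-0.71, -0.61, -0.79, -0.90),
    "CH3":  (-0.17, -0.14, -0.17, -0.16),
    "NH3":  (-0.25, -0.24, -0.25, -0.24),
    "PH3":  (-0.46, -0.60, -0.45, -0.46),
    "C2H2": (-0.27, -0.31, -0.28, -0.26),
    "CH2O": (-0.32, -0.32, -0.33, -0.34),
    "CH4":  (-0.19, -0.19, -0.19, -0.18),
    "C2H4": (-0.33, -0.34, -0.33, -0.31),
}


def deviation_stats(column):
    """Mean and maximum absolute deviation of a column from ACPF/MTsmall."""
    devs = [abs(row[column] - row[0]) for row in TABLE_2_4.values()]
    return sum(devs) / len(devs), max(devs)


for name, col in (("MSFT model", 1), ("scaled B3LYP", 2), ("scaled SCF", 3)):
    mad, largest = deviation_stats(col)
    print(f"{name:13s} MAD = {mad:.2f}  max = {largest:.2f}")
```

The mean absolute deviations come out as 0.08, 0.03 and 0.05 kcal/mol, matching the table. The maximum for the scaled SCF column is 0.19 kcal/mol from the rounded entries (SO2), against the 0.20 kcal/mol quoted, a difference attributable to rounding of the tabulated values.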
4.5. W1c Theory

Here we propose a new reduced-cost variant of W1 theory which we
shall denote W1c (for "cheap"), with W1ch theory being derived
analogously from W1h theory. Specifically, the core correlation and
scalar relativistic steps are replaced by the approximations outlined in
the previous two sections, i.e. the MSFT bond additivity model for
inner-shell correlation and scaled B3LYP/cc-pVTZuc+1 Darwin and
mass-velocity corrections. Representative results (for the W2-1 set) can
be seen in Table 2.1; complete data for the molecules in the G2-1 and
G2-2 sets are available through the World Wide Web as supplementary
material [63] to the present paper.

As seen in Table 2.1, W1c is an acceptable "fallback solution" for
systems for which W1 calculations are not feasible because of the number
of inner-shell orbitals; for heats of formation and certainly for ionization
potentials, W1ch offers a significant further cost reduction over W1h at
a negligible loss in accuracy.

4.6. Detecting Problems 

While CCSD and especially CCSD(T) are known [36] to be less
sensitive to nondynamical correlation effects than low-order
perturbation theoretical methods, some sensitivity remains, and
deterioration of W1 and W2 results is to be expected for systems that
exhibit severe nondynamical correlation character. A number of
indicators exist for this, such as the $T_1$ diagnostic of Lee and
Taylor [64], the size of the largest amplitudes in the converged CCSD
wavefunction, and natural orbital occupations of the frontier orbitals.

One pragmatic criterion which we have found to be very useful is
the percentage of the TAE that gets recovered at the SCF level. For
systems that are wholly dominated by dynamical correlation, like CH4
and H2, this proportion exceeds 80 %, while it drops to 50 % for the
N2 molecule; O2 is only barely bound at the SCF level, and F2 is even
metastable. In the W1/W2 validation paper [26], we invariably found
that large deviations from what appeared to be reliable experimental
data tend to be associated with strong nondynamical correlation, and a
small SCF component of TAE (e.g. 27 % for NO2, 32 % for F2O, and
15 % for ClO).
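
This criterion is trivial to automate. The helper below is a hedged sketch: the function names are ours, the input energies in the example are invented, and the cutoff is deliberately left to the caller, since the text quotes typical percentages but does not prescribe a threshold.

```python
def scf_share_of_tae(tae_scf: float, tae_total: float) -> float:
    """Percentage of the total atomization energy recovered at the SCF level."""
    if tae_total <= 0.0:
        raise ValueError("total atomization energy must be positive")
    return 100.0 * tae_scf / tae_total


def needs_scrutiny(tae_scf: float, tae_total: float, cutoff_percent: float) -> bool:
    """True if the SCF share falls below a user-chosen cutoff (in percent)."""
    return scf_share_of_tae(tae_scf, tae_total) < cutoff_percent


# Invented energies (kcal/mol); the cutoff of 40 % is an arbitrary illustration.
print(scf_share_of_tae(330.0, 400.0))       # well-behaved case, above 80 %
print(needs_scrutiny(40.0, 150.0, 40.0))    # small SCF share, flagged
print(needs_scrutiny(-30.0, 38.0, 40.0))    # unbound at the SCF level, flagged
```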

Would the use of full CCSDT [65] energies, instead of their
quasiperturbative-triples CCSD(T) counterparts, solve the problem? Our
experience has taught us that this generally leads to a deterioration of
the results; it has been shown (e.g. [66]) that the excellent performance
of CCSD(T) for binding energies is at least in part due to error
compensation between partial neglect of higher-order $T_3$ effects and
complete neglect of $T_4$ effects. Unfortunately, explicit treatment of
$T_4$ (connected quadruple excitations) is at present not feasible for
practical-sized systems.

For some very small systems (e.g. Be2 [67] and OH/OH- [68]), we
have considered what one might term W1CAS and W2CAS, in which
the CCSD(T) calculations were replaced by full valence (or larger)
CAS-ACPF calculations. The SCF extrapolation was then applied to the
CASSCF (i.e. Hartree-Fock plus static correlation) energy, and the
CCSD/CCSD(T) extrapolation to the dynamical correlation energy only.
Aside from limited applicability due to the explosive increase in the
number of reference configurations with the number of atoms, the formal
objection of course applies that any separation between "internal" and
"external" orbital spaces is to a large extent arbitrary.

Common sense also suggests that the larger the "gap" being bridged
by the extrapolation from the actual computed number with the largest
basis set to the hypothetical basis set limit, the larger the uncertainty
in the latter will be. (See the example of benzene in section 5.3.)

Finally, the GIGO ("garbage in, garbage out") theorem applies here
as well as in any other matter. For instance, if a B3LYP/cc-pVTZ+1
reference geometry is used for a system where the B3LYP geometry
is known to be qualitatively wrong (such as CH4+), the computed W1
energetics will not be very reliable either.

5. EXAMPLE APPLICATIONS 

5.1. Heats of Vaporization of Boron and Silicon 

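First-principles computation of gas-phase molecular heats of
formation by definition requires the gas-phase heats of formation of the
elements:

\[
\begin{aligned}
\Delta H^\circ_{f,T}(\mathrm{X}_k\mathrm{Y}_l\cdots) &- k\,\Delta H^\circ_{f,T}(\mathrm{X}) - l\,\Delta H^\circ_{f,T}(\mathrm{Y}) - \cdots \\
  &= E_T(\mathrm{X}_k\mathrm{Y}_l\cdots) + RT(1 - k - l - \cdots) - k\,E_T(\mathrm{X}) - l\,E_T(\mathrm{Y}) - \cdots
\end{aligned} \tag{5.1}
\]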

Somewhat disappointingly, the values of $\Delta H^\circ_f$[A(g)] of
some first- and second-row elements A (notably boron and silicon) are
not precisely known because of a variety of experimental difficulties.
However, well-established precise heats of formation of BF3(g) [69] and
SiF4 [70] are available that do not involve the heats of vaporization of
boron and silicon in their determination. Thus, if accurate computed
$\mathrm{TAE}_0$ values of BF3 and SiF4 were available, then, in
combination with the established value [71] of $D_0$(F2), the quantities
sought could be derived from a thermochemical cycle. These were obtained
by means of W2 theory for BF3 [72] and for SiF4 [73]. The final
recommended values are $\Delta H^\circ_{f,0}$[B(g)] = 135.1±0.75
kcal/mol and $\Delta H^\circ_{f,0}$[Si(g)] = 107.15±0.38 kcal/mol. The
boron value is about 2 kcal/mol higher than the CODATA recommended
value and in between a recent evaluation by Hildenbrand [74] and a 1977
measurement by Storms and Mueller [75]. The silicon value is slightly
higher than the CODATA recommended value, and with a much smaller
uncertainty. We note in passing that one of the first arguments for
revision of $\Delta H^\circ_{f,0}$[B(g)] and
$\Delta H^\circ_{f,0}$[Si(g)] was given in [76] on computational
(CBS-Q) grounds.
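
Written out at 0 K, the thermochemical cycle takes the following form. This is a sketch in our own notation, using only the definition of the total atomization energy and the fact that the heat of formation of atomic fluorine at 0 K is one-half $D_0$(F2):

```latex
\begin{align*}
\mathrm{TAE}_0(\mathrm{BF_3})
  &= \Delta H^\circ_{f,0}[\mathrm{B}(g)] + 3\,\Delta H^\circ_{f,0}[\mathrm{F}(g)]
     - \Delta H^\circ_{f,0}[\mathrm{BF_3}(g)], \\
\Delta H^\circ_{f,0}[\mathrm{F}(g)]
  &= \tfrac{1}{2}\, D_0(\mathrm{F_2}), \\
\Rightarrow\quad
\Delta H^\circ_{f,0}[\mathrm{B}(g)]
  &= \mathrm{TAE}_0(\mathrm{BF_3}) + \Delta H^\circ_{f,0}[\mathrm{BF_3}(g)]
     - \tfrac{3}{2}\, D_0(\mathrm{F_2}), \\
\Delta H^\circ_{f,0}[\mathrm{Si}(g)]
  &= \mathrm{TAE}_0(\mathrm{SiF_4}) + \Delta H^\circ_{f,0}[\mathrm{SiF_4}(g)]
     - 2\, D_0(\mathrm{F_2}).
\end{align*}
```

The computed quantity enters only through $\mathrm{TAE}_0$, so the uncertainty of the derived atomic heat of formation combines the error of the calculation with the experimental uncertainties of the fluoride heat of formation and of $D_0$(F2).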

5.2. Validating DFT Methods for Transition States: 
the Walden Inversion 

It is well known (e.g. [77, 78]) that the prediction of reaction
barrier heights is one of the main "Achilles' heels" of density
functional theory. For instance [79], for the prototype $S_N2$ reaction,

\[
\mathrm{X^- + CH_3Y \rightarrow CH_3X + Y^-}, \tag{5.2}
\]

B3LYP predicts a negative overall barrier if X = Y = Cl (i.e. a barrier
between the entry and exit ion-molecule complexes that lies below the
entrance channel). Adamo and Barone [79] demonstrated that their
new mPW1PW91 (modified Perdew-Wang) functional at least yields
the correct sign for this problem.

In Ref. 80 we carried out a W1 and W2 investigation for all six cases 
with X, Y ∈ {F, Cl, Br}, in order to assess the performance of a number 
of DFT exchange-correlation functionals. W2 is in excellent agreement 
with experiment where reliable experimental data are available; in some 
other cases, the W1 calculations either suggest revisions or provide the 
only reliable data available (see Ref. 80 for details). 

Of the different exchange-correlation functionals considered, the 
new mPW1K [59] functional of Truhlar and coworkers appears to yield 
the best performance among "hybrid" functionals (i.e. those including 
a fraction of exact exchange), followed by BH&HLYP (a half-and-half 
mixture [58] of Hartree-Fock and Becke 1988 exchange [81] with 
Lee-Yang-Parr correlation). Among "pure DFT" functionals, the best 
performance is delivered by HCTH-120 [82] (the 120-molecule 
reparameterization of the Hamprecht-Cohen-Tozer-Handy functional). (We 
note in passing that this latter functional was parameterized entirely 
against ab initio data.) The G2 data of Pross et al. [83], despite some 
quantitative discrepancies, are qualitatively in perfect agreement with 
W1 theory. 

We also note that in one case (F, Br) it was impossible to obtain 
all required stationary points at the B3LYP level, since the F⁻···CH3Br 
minimum does not show up at all at this level. Only mPW1K and 
BH&HLYP find this stationary point, as does CCSD(T). 

5.3. Benzene as a "Stress Test" of the Method 

As an illustrative example of "stress-testing" W1 and W2 theory, we 
shall consider the benzene molecule [86]. The most accurate calculation 
we were able to carry out is at the W2h level: the rate-determining step 
was the direct CCSD/cc-pV5Z calculation (30 electrons correlated, 876 
basis functions, carried out in the D2h subgroup of D6h), which took 
nearly two weeks on an Alpha EV67/667 MHz CPU. Relevant results 
are collected in Table 2.5. 

At first sight, the disagreement between the computed W2h value of 
ΔH°f,0K = 23.0 kcal/mol and the experimental value of 24.0±0.2 kcal/mol 
seems disheartening. (Note that it "errs" on the opposite side from the 
most recent previous benchmark calculation [53], 24.7±0.3 kcal/mol, which 
used basis sets of similar size to those of W1 theory.) However, the 
comparison with experiment is not entirely "fair" since it neglects the 
experimental uncertainties in the atomic heats of formation required to 
convert an atomization energy into a heat of formation (or vice versa). 
Combining these with the experimental ΔH°f,0K leads to an experimentally 
derived TAE0 = 1305.7±0.7 kcal/mol, where the uncertainty is dominated by 
six times that in the heat of vaporization of graphite. In other words, 
our calculated TAE0 = 1306.7 kcal/mol is only 0.3 kcal/mol removed 
from the upper end of the experimental uncertainty interval. (After all, 
an error of 0.02 % seems to be a bit much to ask for.) 
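
The error propagation behind the ±0.7 kcal/mol figure can be checked with a few lines of arithmetic. The sketch below uses only numbers quoted in the text and in the footnote to Table 2.5; the variable and function names are illustrative choices, and the uncertainty in the hydrogen atom heat of formation is neglected, as in the footnote.

```python
import math

# All numbers (kcal/mol) are quoted from the text and the footnote to Table 2.5.
DHF0_BENZENE_UNCERTAINTY = 0.2   # experimental heat of formation of C6H6(g)
DHF0_CARBON_UNCERTAINTY = 0.11   # CODATA heat of formation of C(g)
N_CARBON = 6                     # carbon atoms in benzene
TAE0_EXPERIMENT = 1305.7         # experimentally derived TAE0
TAE0_W2H = 1306.67               # computed W2h TAE0


def combined_uncertainty(u_molecule, u_atom, n_atoms):
    """Root-sum-square of the molecular term and n_atoms times the atomic term."""
    return math.sqrt(u_molecule ** 2 + (n_atoms * u_atom) ** 2)


u_tae0 = combined_uncertainty(
    DHF0_BENZENE_UNCERTAINTY, DHF0_CARBON_UNCERTAINTY, N_CARBON
)
upper_end = TAE0_EXPERIMENT + u_tae0
distance_from_interval = TAE0_W2H - upper_end

print(f"uncertainty in experimental TAE0: {u_tae0:.2f} kcal/mol")
print(f"W2h lies {distance_from_interval:.2f} kcal/mol above the upper end")
```

The carbon term (6 × 0.11 kcal/mol) dominates the quadrature sum, which is why the uncertainty in the derived TAE0 is so much larger than that of the molecular heat of formation itself.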

Secondly, let us consider the "gaps" bridged by the extrapolations. 
For the SCF component, that gap is a very reasonable 0.3 kcal/mol 
(0.03 %), but for the CCSD valence correlation component this rises to 
5 kcal/mol (1.7 %) while for the connected triple excitations contribution 
it amounts to 1 kcal/mol (3.7 % — note however that a smaller basis 
set is being used than for CCSD). It is clear that the extrapolations are 
indispensable to obtain even a useful result, let alone an accurate one, 
even with such large basis sets. 

Inner-shell correlation, at 7 kcal/mol, is of quite nontrivial 
importance, but even scalar relativistic effects (at 1 kcal/mol) cannot 
be ignored. And manifestly, even a 2 % error in a 62 kcal/mol zero-point 
vibrational energy would be unacceptable. 

Table 2.5 Individual components in W1h, W1, and W2h total atomization energy 
cum heat of formation of benzene. All data in kcal/mol.(a) 

                            W1h                 W1                  W2h
Reference geometry          B3LYP/cc-pVTZ       B3LYP/cc-pVTZ       CCSD(T)/cc-pVQZ

SCF                         VDZ    1024.19      A'VDZ  1024.59      VTZ    1042.16
                            VTZ    1042.10      A'VTZ  1042.62      VQZ    1044.62
                            VQZ    1044.56      A'VQZ  1044.84      V5Z    1045.30
  old-style                 V∞Z    1044.95      V∞Z    1045.15      V∞Z    1045.56
  new-style                 V∞Z    1045.33      V∞Z    1045.53      V∞Z    1045.63
CCSD                        VDZ     225.94      A'VDZ   226.11      VTZ     265.49
                            VTZ     265.55      A'VTZ   268.44      VQZ     280.91
                            VQZ     280.97      A'VQZ   282.39      V5Z     285.72
                            V∞Z     291.08      V∞Z     291.53      V∞Z     290.77
(T)                         VDZ      18.72      A'VDZ    19.64      VTZ      24.41
                            VTZ      24.42      A'VTZ    24.78      VQZ      25.74
                            V∞Z      26.55      V∞Z      26.69      V∞Z      26.71
Inner-shell correlation               7.09                7.08                7.10
Darwin and mass-velocity             -0.99               -0.99               -0.99
Spin-orbit coupling                  -0.51               -0.51               -0.51
TAEe                               1368.54             1369.33             1368.71
EZPV                                 62.04               62.04               62.04
TAE0                               1306.49             1307.29             1306.67
ΔH°f,0K[C6H6(g)]                     23.18               22.39               23.01
Δ[H298.15 − H0]                      -4.24               -4.24               -4.24
ΔH°f,298.15K[C6H6(g)]                18.95               18.15               18.78

(a) Lower level TAE0: 1301.9 (G2), 1305.2 (G3), 1303.7 (CBS-QB3), and 1304.3 
(CBS-Q) kcal/mol. Experiment: ΔH°f,0K[C6H6(g)] = 24.0±0.2 kcal/mol. [J. B. 
Pedley, Thermodynamic Data and Structures of Organic Compounds 
(Thermodynamics Research Center, College Station, TX, 1994); Vol. 1.] This 
standard enthalpy of formation produces TAE0 = 1305.7±0.7 kcal/mol, where 
the uncertainty equals √(0.2² + (6 × 0.11)²); 0.11 kcal/mol being the 
uncertainty in the CODATA ΔH°f,0[C(g)] = 169.98±0.11 kcal/mol [69]. (The 
uncertainty in ΔH°f,0[H(g)] is negligible.) 

Let us now consider the more approximate results. While W1h 
coincidentally agrees to better than 0.2 kcal/mol with the W2h result, 
W1 deviates from the latter by 0.6 kcal/mol. Note, however, that in 
W1h theory, the extrapolations bridge gaps of 0.8 (SCF), 10.1 (CCSD), 
and 2.1 (T) kcal/mol, the corresponding amounts for W1 theory being 
0.7, 9.1, and 1.9 kcal/mol, respectively. Common sense suggests that 
if extrapolations account for 13.0 (W1h) and 11.7 (W1) kcal/mol, then 
a discrepancy of 1 kcal/mol should not come as a surprise. In fact, 
the relatively good agreement between the two sets of numbers and the 
more rigorous W2h result (total extrapolation: 6.3 kcal/mol) testifies, if 
anything, to the robustness of the method. 
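
The gaps quoted above follow directly from Table 2.5 as the difference between each extrapolated limit and the value in the largest basis set actually used. The sketch below reproduces that bookkeeping; the dictionary layout and function names are illustrative, and the SCF limits are the new-style ones, which are the values that enter the tabulated TAEe.

```python
# Contributions to the TAE of benzene (kcal/mol), copied from Table 2.5.
# Each entry is (value in the largest basis set used, extrapolated limit).
# The SCF limits are the "new-style" ones.
TABLE_2_5 = {
    "W1h": {"SCF": (1044.56, 1045.33), "CCSD": (280.97, 291.08), "(T)": (24.42, 26.55)},
    "W1": {"SCF": (1044.84, 1045.53), "CCSD": (282.39, 291.53), "(T)": (24.78, 26.69)},
    "W2h": {"SCF": (1045.30, 1045.63), "CCSD": (285.72, 290.77), "(T)": (25.74, 26.71)},
}


def extrapolation_gaps(components):
    """Gap bridged by the extrapolation for each component."""
    return {name: limit - largest for name, (largest, limit) in components.items()}


def total_gap(components):
    """Sum of the gaps bridged over all components."""
    return sum(extrapolation_gaps(components).values())


for method, components in TABLE_2_5.items():
    gaps = extrapolation_gaps(components)
    per_component = ", ".join(f"{name} {gap:.1f}" for name, gap in gaps.items())
    print(f"{method}: {per_component}; total {total_gap(components):.1f} kcal/mol")
```

Seen this way, the W1h and W1 extrapolations each bridge roughly twice as much as the W2h ones, which puts the sub-kcal/mol spread among the three TAE0 values in perspective.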

As for the difference of about 0.4 kcal/mol between the old-style 
and new-style SCF extrapolations in W1h and W1 theories, comparison 
with the W2h SCF limits clearly suggests the new-style extrapolation 
to be the more reliable one. (The two extrapolations yield basically the 
same result in W2h.) This should not be seen as an indication that the 
E∞ + A/L^5 formula is somehow better founded theoretically, but rather 
as an example of why reliance on (aug-)cc-pVDZ data should be avoided 
if at all possible. Users who prefer the geometric extrapolation for the 
SCF component could consider carrying out a direct SCF calculation 
in the "extra large" (i.e. V5Z) basis set and applying the E∞ + A/B^L 
extrapolation to the "medium", "large", and "extra large" SCF data. 
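
For readers who wish to reproduce the two SCF limits, the following is a minimal sketch of both extrapolations. It assumes that the new-style formula is the two-point form E∞ + A/L^5 and the old-style one the three-point geometric form E∞ + A/B^L, with L the cardinal number of the basis set (3 for VTZ, 4 for VQZ, 5 for V5Z). As a further simplification, the formulas are applied here directly to the tabulated atomization-energy contributions, whereas in practice one would extrapolate the total energy of each species. The function names are hypothetical.

```python
def extrapolate_two_point(e_small, e_large, l_small, l_large, power=5.0):
    """Limit of E(L) = E_inf + A / L**power from two basis sets."""
    ratio = (l_large / l_small) ** power
    return e_large + (e_large - e_small) / (ratio - 1.0)


def extrapolate_geometric(e1, e2, e3):
    """Limit of E(L) = E_inf + A / B**L from three consecutive basis sets."""
    d1 = e2 - e1  # change on going from the first to the second basis set
    d2 = e3 - e2  # change on going from the second to the third basis set
    return e3 + d2 * d2 / (d1 - d2)


# W2h SCF contributions from Table 2.5 (kcal/mol): VTZ, VQZ, V5Z.
w2h_new = extrapolate_two_point(1044.62, 1045.30, 4, 5)
w2h_old = extrapolate_geometric(1042.16, 1044.62, 1045.30)

# W1h SCF contributions from Table 2.5 (kcal/mol): VDZ, VTZ, VQZ.
w1h_new = extrapolate_two_point(1042.10, 1044.56, 3, 4)
w1h_old = extrapolate_geometric(1024.19, 1042.10, 1044.56)

print(f"W2h: new-style {w2h_new:.2f}, old-style {w2h_old:.2f} kcal/mol")
print(f"W1h: new-style {w1h_new:.2f}, old-style {w1h_old:.2f} kcal/mol")
```

Under these assumptions the sketch agrees with the tabulated V∞Z entries to within rounding, and it shows why the geometric formula is sensitive to the smallest basis set: the first difference d1 is dominated by the VDZ value in W1h.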

6. CONCLUSIONS AND PROSPECTS 

W1/W2 theory and their variants would appear to represent a valuable 
addition to the computational chemist's toolbox, both for applications 
that require high-accuracy energetics for small molecules and as a 
potential source of parameterization data for more approximate methods. 
The extra cost of W2 theory (compared to W1 theory) does appear 
to translate into better results for heats of formation and electron 
affinities, but does not appear to be justified for ionization potentials 
and proton affinities, for which the W1 approach yields basically 
converged results. Explicit calculation of anharmonic zero-point energies 
(as opposed to scaling of harmonic ones) does lead to a further 
improvement in the quality of W2 heats of formation; at the W1 level, the 
improvement is not sufficiently noticeable to justify the extra expense 
and difficulty. 

Of the various reduced-cost variants introduced in this paper, W2h 
performs essentially as accurately as W2 for heats of formation. 
Likewise, W1h is essentially as good as W1 theory for ionization 
potentials, and almost as good for heats of formation. Neither method is 
recommended for electron affinities. 

In systems where a large number of inner-shell electrons makes the 
inner-shell correlation (and, to a lesser extent, scalar relativistic) steps 
in W1 and W2 theory unfeasible, the use of a bond equivalent model 
for the inner-shell correlation and scaled B3LYP/cc-pVTZuc+1 scalar 
relativistic corrections offers an alternative under the name of W1c and 
W1ch theories. 

One plan for the future is the extension to heavier element systems; 
the first step in this direction has been made recently with the 
development of the SDB-cc-pVnZ valence basis sets [84] (for use with the 
Stuttgart-Dresden-Bonn relativistic ECPs [85]) for third- and fourth-row 
main group elements. 

Further improvement of accuracy, as well as applicability to systems 
exhibiting nondynamical correlation, will almost certainly require 
some level of treatment of connected quadruple excitations. 